- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
Positive concave deep equilibrium models
Gabor, Mateusz, Piotrowski, Tomasz, Cavalcante, Renato L. G.
Deep equilibrium (DEQ) models are widely recognized as a memory-efficient alternative to standard neural networks, achieving state-of-the-art performance in language modeling and computer vision tasks. These models solve a fixed point equation instead of explicitly computing the output, which sets them apart from standard neural networks. However, existing DEQ models often lack formal guarantees of the existence and uniqueness of the fixed point, and the convergence of the numerical scheme used for computing the fixed point is not formally established. As a result, DEQ models are potentially unstable in practice. To address these drawbacks, we introduce a novel class of DEQ models called positive concave deep equilibrium (pcDEQ) models. Our approach, which is based on nonlinear Perron-Frobenius theory, enforces nonnegative weights and activation functions that are concave on the positive orthant. By imposing these constraints, we can easily ensure the existence and uniqueness of the fixed point without relying on additional complex assumptions commonly found in the DEQ literature, such as those based on monotone operator theory in convex analysis. Furthermore, the fixed point can be computed with the standard fixed point algorithm, and we provide theoretical guarantees of geometric convergence, which, in particular, simplifies the training process. Experiments demonstrate the competitiveness of our pcDEQ models against other implicit models.
- Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
- Asia > Japan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
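The two structural constraints in the pcDEQ abstract — nonnegative weights and an activation that is concave on the positive orthant — can be illustrated with a toy fixed-point iteration. The sketch below is a minimal NumPy illustration (the layer sizes, sqrt activation, and random weights are assumptions for demonstration, not the paper's architecture); the point is that the standard fixed-point algorithm reaches the same unique fixed point from very different starting vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# The two pcDEQ constraints from the abstract, in toy form: nonnegative
# weights and an activation (sqrt) that is concave on the positive orthant.
# Sizes and the sqrt choice are illustrative, not the paper's model.
W = rng.uniform(0.0, 1.0, (4, 4))
b = rng.uniform(0.1, 1.0, 4)

def layer(z):
    """A positive concave mapping on the positive orthant."""
    return W @ np.sqrt(z) + b

def fixed_point(z0, tol=1e-10, max_iter=200):
    """Standard fixed-point iteration z <- layer(z)."""
    z = z0
    for _ in range(max_iter):
        z_next = layer(z)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

# Two very different initializations reach the same unique fixed point.
z_a = fixed_point(np.full(4, 0.5))
z_b = fixed_point(np.full(4, 50.0))
print(np.allclose(z_a, z_b, atol=1e-6))   # True
```

The geometric convergence claimed in the abstract shows up here as the residual shrinking by a roughly constant factor per iteration.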
Fixed Point Diffusion Models
Bai, Xingjian, Melas-Kyriazi, Luke
We introduce the Fixed Point Diffusion Model (FPDM), a novel approach to image generation that integrates the concept of fixed point solving into the framework of diffusion-based generative modeling. Our approach embeds an implicit fixed point solving layer into the denoising network of a diffusion model, transforming the diffusion process into a sequence of closely-related fixed point problems. Combined with a new stochastic training method, this approach significantly reduces model size, reduces memory usage, and accelerates training. Moreover, it enables the development of two new techniques to improve sampling efficiency: reallocating computation across timesteps and reusing fixed point solutions between timesteps. We conduct extensive experiments with state-of-the-art models on ImageNet, FFHQ, CelebA-HQ, and LSUN-Church, demonstrating substantial improvements in performance and efficiency. Compared to the state-of-the-art DiT model, FPDM contains 87% fewer parameters, consumes 60% less memory during training, and improves image generation quality in situations where sampling computation or time is limited. Our code and pretrained models are available at https://lukemelas.github.io/fixed-point-diffusion-models.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > Netherlands > South Holland > Dordrecht (0.04)
- (7 more...)
- Research Report > Promising Solution (0.54)
- Overview > Innovation (0.34)
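The second sampling technique in the FPDM abstract — reusing fixed point solutions between timesteps — amounts to warm-starting each timestep's solver at the previous timestep's solution. A toy sketch of why that helps (the contractive maps below are invented stand-ins for the denoiser's fixed-point problems, chosen so that consecutive solutions drift slowly):

```python
import numpy as np

# Toy per-timestep fixed-point problems z = f(z, t): contractive maps whose
# solutions drift slowly with t, mimicking the "closely-related fixed point
# problems" of the abstract. The maps are invented, not the FPDM denoiser.
def f(z, t):
    return 0.5 * z + np.array([np.sin(0.1 * t), np.cos(0.1 * t)])

def solve(z, t, tol=1e-8):
    """Fixed-point iteration; returns the solution and the iteration count."""
    iters = 0
    while np.linalg.norm(f(z, t) - z) > tol:
        z = f(z, t)
        iters += 1
    return z, iters

cold_total, warm_total = 0, 0
z_warm = np.zeros(2)
for t in range(10, 0, -1):                   # reverse (denoising) order
    _, n_cold = solve(np.zeros(2), t)        # cold start at every timestep
    z_warm, n_warm = solve(z_warm, t)        # reuse the previous solution
    cold_total += n_cold
    warm_total += n_warm
print(warm_total < cold_total)   # True: warm starts need fewer iterations
```

Because each map is `z = 0.5*z + c(t)`, its exact solution is `2*c(t)`, so a warm start begins much closer to the answer than the origin does.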
Probabilistic Control and Majorization of Optimal Control
Probabilistic control design is founded on the principle that a rational agent attempts to match the modelled closed-loop system trajectory density with an arbitrary desired one. The framework was originally proposed as a tractable alternative to traditional optimal control design, parametrizing desired behaviour through fictitious transition and policy densities and using the information projection as a proximity measure. In this work we introduce an alternative parametrization of desired closed-loop behaviour and explore alternative proximity measures between densities. It is then illustrated how the associated probabilistic control problems resolve into uncertain or probabilistic policies. Our main result shows that the probabilistic control objectives majorize conventional stochastic and risk-sensitive optimal control objectives. This observation allows us to identify two probabilistic fixed point iterations that converge to the deterministic optimal control policies, establishing an explicit connection between the two formulations. Further, we demonstrate that the risk-sensitive optimal control formulation is technically equivalent to a Maximum Likelihood estimation problem on a probabilistic graphical model, where the notion of cost is directly encoded into the model. The associated treatment of the estimation problem is then shown to coincide with the moment-projected probabilistic control formulation. In this way, optimal decision making can be reformulated as an iterative inference problem. Based on these insights, we discuss directions for algorithmic development.
- Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Control Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
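A familiar concrete instance of a fixed point iteration converging to a deterministic optimal control policy is classical value iteration, where the Bellman optimality operator is contracted to its fixed point. The sketch below uses a made-up two-state, two-action MDP for illustration (it is not the paper's probabilistic formulation, just the conventional objective that the paper's iterations are shown to connect to):

```python
import numpy as np

# A two-state, two-action MDP (all numbers made up for illustration).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],     # P[s, a, s']: transition probs
              [[0.7, 0.3], [0.05, 0.95]]])
R = np.array([[1.0, 0.0], [0.5, 2.0]])      # R[s, a]: immediate rewards
gamma = 0.9

def bellman(V):
    """Bellman optimality operator T(V); a gamma-contraction."""
    return np.max(R + gamma * P @ V, axis=1)

V = np.zeros(2)
for _ in range(500):        # fixed-point iteration V <- T(V)
    V = bellman(V)

# The fixed point yields the deterministic optimal policy by greedy selection.
policy = np.argmax(R + gamma * P @ V, axis=1)
```

Since T is a gamma-contraction, the iterates converge geometrically and the greedy policy read off the fixed point is optimal.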
TorchDEQ: A Library for Deep Equilibrium Models
Geng, Zhengyang, Kolter, J. Zico
Deep Equilibrium (DEQ) Models, an emerging class of implicit models that maps inputs to fixed points of neural networks, are of growing interest in the deep learning community. However, training and applying DEQ models is currently done in an ad-hoc fashion, with various techniques spread across the literature. In this work, we systematically revisit DEQs and present TorchDEQ, an out-of-the-box PyTorch-based library that allows users to define, train, and infer using DEQs over multiple domains with minimal code and best practices. Using TorchDEQ, we build a "DEQ Zoo" that supports six published implicit models across different domains. By developing a joint framework that incorporates the best practices across all models, we have substantially improved the performance, training stability, and efficiency of DEQs on ten datasets across all six projects in the DEQ Zoo. TorchDEQ and DEQ Zoo are released as open source.
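As a reference point for what such a library automates, a DEQ layer maps an input x to a fixed point z* of a weight-tied cell. Below is a bare NumPy sketch of just the forward pass (random toy weights, naive iteration — this is not the TorchDEQ API, which additionally provides solvers, training utilities, and implicit differentiation):

```python
import numpy as np

rng = np.random.default_rng(1)
# Weight-tied cell z = tanh(W z + U x + b); the small factor on W keeps the
# map contractive so naive iteration converges. All shapes and weights are
# toy assumptions for illustration.
W = 0.15 * rng.standard_normal((3, 3))
U = rng.standard_normal((3, 2))
b = rng.standard_normal(3)

def deq_forward(x, iters=150):
    """Map input x to the fixed point z* of the weight-tied cell."""
    z = np.zeros(3)
    for _ in range(iters):
        z = np.tanh(W @ z + U @ x + b)
    return z

x = np.array([0.5, -1.0])
z_star = deq_forward(x)
# z_star satisfies z = tanh(W z + U x + b) up to numerical tolerance.
```

The "infinite-depth" interpretation is visible here: the same cell is applied until it stops changing, instead of stacking distinct layers.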
Deep Equilibrium Models Meet Federated Learning
Gkillas, Alexandros, Ampeliotis, Dimitris, Berberidis, Kostas
In this study the problem of Federated Learning (FL) is explored from a new perspective by utilizing Deep Equilibrium (DEQ) models instead of conventional deep learning networks. We claim that incorporating DEQ models into the federated learning framework naturally addresses several open problems in FL, such as the communication overhead due to sharing large models and the ability to incorporate heterogeneous edge devices with significantly different computation capabilities. Additionally, a weighted average fusion rule is proposed at the server side of the FL framework to account for the different qualities of models from heterogeneous edge devices. To the best of our knowledge, this study is the first to establish a connection between DEQ models and federated learning, contributing to the development of an efficient and effective FL framework. Finally, promising initial experimental results are presented, demonstrating the potential of this approach in addressing challenges of FL.
- South America > Peru > Loreto Department (0.04)
- North America > Mexico > Gulf of Mexico (0.04)
- Europe > Middle East > Cyprus (0.04)
- Europe > Greece > West Greece > Patra (0.04)
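The server-side weighted average fusion rule described in the abstract reduces to normalizing per-client quality scores and averaging parameters with them. A minimal sketch (the parameter vectors and quality scores below are placeholders — the abstract does not specify the paper's actual quality metric):

```python
import numpy as np

# Per-client parameter vectors and quality scores (all values placeholders;
# the quality metric used by the paper is not specified in the abstract).
client_params = [np.array([1.0, 2.0]),
                 np.array([3.0, 4.0]),
                 np.array([5.0, 6.0])]
quality = np.array([0.5, 0.3, 0.2])

# Normalize the scores so the fused model is a convex combination.
w = quality / quality.sum()
fused = sum(wi * p for wi, p in zip(w, client_params))
print(fused)   # [2.4 3.4]
```

With uniform scores this collapses to plain FedAvg-style averaging; the weighting only matters when clients' model qualities differ.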
Layer-Neighbor Sampling -- Defusing Neighborhood Explosion in GNNs
Balın, Muhammed Fatih, Çatalyürek, Ümit V.
Graph Neural Networks (GNNs) have received significant attention recently, but training them at a large scale remains a challenge. Mini-batch training coupled with sampling is used to alleviate this challenge. However, existing approaches either suffer from the neighborhood explosion phenomenon or have poor performance. To address these issues, we propose a new sampling algorithm called LAyer-neighBOR sampling (LABOR). It is designed to be a direct replacement for Neighbor Sampling (NS) with the same fanout hyperparameter while sampling up to 7 times fewer vertices, without sacrificing quality. By design, the variance of the estimator of each vertex matches NS from the point of view of a single vertex. Moreover, under the same vertex sampling budget constraints, LABOR converges faster than existing layer sampling approaches and can use up to 112 times larger batch sizes compared to NS.
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
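For contrast with LABOR, here is plain Neighbor Sampling (NS) with a fanout hyperparameter on a toy graph (the adjacency lists are invented for illustration). Each layer samples up to `fanout` neighbors of every frontier vertex, which is what lets the sampled neighborhood grow like fanout**depth — the explosion that layer-sampling methods such as LABOR are designed to defuse:

```python
import random

# Toy directed graph as adjacency lists (invented for illustration).
adj = {0: [1, 2, 3, 4], 1: [2, 5], 2: [0, 5, 6],
       3: [], 4: [0, 6], 5: [1], 6: [2]}

def neighbor_sample(seeds, fanout, num_layers, rng):
    """Plain NS: per layer, sample up to `fanout` neighbors of each vertex."""
    layers = [set(seeds)]
    frontier = set(seeds)
    for _ in range(num_layers):
        nxt = set()
        for v in frontier:
            nbrs = adj[v]
            nxt.update(rng.sample(nbrs, min(fanout, len(nbrs))))
        layers.append(nxt)
        frontier = nxt
    return layers

rng = random.Random(0)
layers = neighbor_sample([0], fanout=2, num_layers=2, rng=rng)
# Frontier size can grow roughly like fanout**depth with vanilla NS.
```

LABOR keeps the same fanout interface while correlating samples across the layer so fewer distinct vertices are drawn in total.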